LedPred: Learning from DNA to Predict enhancers

Authors

  • Elodie Darbo
  • Denis Seyres
  • Aitor Gonzalez
Abstract

2 Description
  2.1 Data
  2.2 Learning from CRM-contained information to predict new regulatory features
    2.2.1 Building the training set
    2.2.2 Optimization of support vector machine model
    2.2.3 Definition of the optimal SVM parameters (γ and C)
    2.2.4 Sorting and selecting features according to their importance in the training set description
    2.2.5 Plotting the performances of the model
  2.3 Function description
    2.3.1 From bed CRM genomic coordinates to training set matrix
    2.3.2 Scaling of the CRM feature matrix
    2.3.3 SVM parameter optimization
    2.3.4 Features ranking
    2.3.5 Selecting features
    2.3.6 Creating the best model
    2.3.7 Plotting model performance
    2.3.8 Using the model to score unknown sequences
    2.3.9 From the matrix to the model in one function

Similar resources

DELTA: A Distal Enhancer Locating Tool Based on AdaBoost Algorithm and Shape Features of Chromatin Modifications

Accurate identification of DNA regulatory elements becomes an urgent need in the post-genomic era. Recent genome-wide chromatin states mapping efforts revealed that DNA elements are associated with characteristic chromatin modification signatures, based on which several approaches have been developed to predict transcriptional enhancers. However, their practical application is limited by incomp...

Correction: LMethyR-SVM: Predict Human Enhancers Using Low Methylated Regions based on Weighted Support Vector Machines

BACKGROUND The identification of enhancers is a challenging task. Various types of epigenetic information including histone modification have been utilized in the construction of enhancer prediction models based on a diverse panel of machine learning schemes. However, DNA methylation profiles generated from the whole genome bisulfite sequencing (WGBS) have not been fully explored for their pote...

Integrating Diverse Datasets Improves Developmental Enhancer Prediction

Gene-regulatory enhancers have been identified using various approaches, including evolutionary conservation, regulatory protein binding, chromatin modifications, and DNA sequence motifs. To integrate these different approaches, we developed EnhancerFinder, a two-step method for distinguishing developmental enhancers from the genomic background and then predicting their tissue specificity. Enha...

Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines

The chemical modification of histones at specific DNA regulatory elements is linked to the activation, inactivation and poising of genes. A number of tools exist to predict enhancers from chromatin modification maps, but their practical application is limited because they either (i) consider a smaller number of marks than those necessary to define the various enhancer classes or (ii) work with ...

A synergistic DNA logic predicts genome-wide chromatin accessibility.

Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SC...

Publication date: 2016